Hartigan's Method: k-means Clustering without Voronoi

Authors

  • Matus Telgarsky
  • Andrea Vattani
Abstract

Hartigan’s method for k-means clustering is the following greedy heuristic: select a point, and optimally reassign it. This paper develops two other formulations of the heuristic, one leading to a number of consistency properties, the other showing that the data partition is always quite separated from the induced Voronoi partition. A characterization of the volume of this separation is provided. Empirical tests verify not only good optimization performance relative to Lloyd’s method, but also good running time.
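The greedy step described in the abstract (select a point, then optimally reassign it) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It relies on the standard identity that moving a point x from cluster A to cluster B changes the k-means cost by |B|/(|B|+1)·‖x − μ_B‖² − |A|/(|A|−1)·‖x − μ_A‖². The function names `hartigan` and `kmeans_cost`, the sweep order, the tolerance, and the rule of never emptying a cluster are all assumptions made for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans_cost(points, labels, k):
    """Sum of squared distances from each point to its cluster mean."""
    total = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue
        mean = [sum(col) / len(members) for col in zip(*members)]
        total += sum(sq_dist(p, mean) for p in members)
    return total


def hartigan(points, labels, k, max_sweeps=100):
    """Greedy single-point reassignment (hypothetical helper, see lead-in).

    `labels` is a list of cluster indices in range(k); it is updated in place
    and returned. Singleton clusters are never emptied (an assumption).
    """
    dim = len(points[0])
    counts = [0] * k
    sums = [[0.0] * dim for _ in range(k)]
    for p, c in zip(points, labels):
        counts[c] += 1
        for d in range(dim):
            sums[c][d] += p[d]

    def mean(c):
        return [s / counts[c] for s in sums[c]]

    for _ in range(max_sweeps):
        moved = False
        for i, p in enumerate(points):
            a = labels[i]
            if counts[a] <= 1:
                continue  # do not empty a cluster
            # Cost removed by taking p out of its current cluster a.
            gain = counts[a] / (counts[a] - 1) * sq_dist(p, mean(a))
            best, best_delta = a, 0.0
            for b in range(k):
                if b == a:
                    continue
                # Cost added by putting p into cluster b (zero if b is empty).
                added = 0.0
                if counts[b] > 0:
                    added = counts[b] / (counts[b] + 1) * sq_dist(p, mean(b))
                delta = added - gain
                if delta < best_delta - 1e-12:
                    best, best_delta = b, delta
            if best != a:
                # Apply the move and update the running sums and counts.
                for d in range(dim):
                    sums[a][d] -= p[d]
                    sums[best][d] += p[d]
                counts[a] -= 1
                counts[best] += 1
                labels[i] = best
                moved = True
        if not moved:
            break  # no single-point move lowers the cost
    return labels
```

Note that the move test compares cost changes rather than distances to the nearest center, so a point may stay in a cluster whose center is not the closest one. This is one way to see why the resulting partition need not coincide with the Voronoi partition induced by the centers.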

Similar resources

CTVN: Clustering Technique Using Voronoi Diagram

Clustering is one of the most important and basic tools for data mining. We propose a new clustering technique using the K-Means algorithm and the Voronoi diagram to unfold the hidden pattern in a given dataset. In the first phase we use the K-Means algorithm to create a set of small clusters, and in the next phase we use the Voronoi diagram to create the actual clusters. K-Means and Voronoi diagram based clu...

Hartigan's K-Means Versus Lloyd's K-Means - Is It Time for a Change?

Hartigan’s method for k-means clustering holds several potential advantages compared to the classical and prevalent optimization heuristic known as Lloyd’s algorithm. E.g., it was recently shown that the set of local minima of Hartigan’s algorithm is a subset of those of Lloyd’s method. We develop a closed-form expression that allows to establish Hartigan’s method for k-means clustering with an...

An automated method for gridding and clustering-based segmentation of cDNA microarray images

Microarrays are widely used to quantify gene expression levels. Microarray image analysis is one of the tools, which are necessary when dealing with vast amounts of biological data. In this work we propose a new method for the automated analysis of microarray images. The proposed method consists of two stages: gridding and segmentation. Initially, the microarray images are preprocessed using te...

FUZZY K-NEAREST NEIGHBOR METHOD TO CLASSIFY DATA IN A CLOSED AREA

Clustering of objects is an important area of research and application in a variety of fields. In this paper we present a technique for data clustering and apply it to data clustering in a closed area. We compare this method with K-nearest neighbor and K-means.

Persistent K-Means: Stable Data Clustering Algorithm Based on K-Means Algorithm

Identifying clusters, or clustering, is an important aspect of data analysis. It is the task of grouping a set of objects in such a way that objects in the same group/cluster are more similar in some sense or another. It is a main task of exploratory data mining, and a common technique for statistical data analysis. This paper proposed an improved version of K-Means algorithm, namely Persistent K...

Publication date: 2010